1. arXiv:2211.14097 [pdf, other]


Variance change point detection with credible sets 

Authors: Lorenzo Cappello, Oscar Hernan Madrid Padilla 

Abstract: This paper introduces a novel Bayesian approach to detect changes in the variance of a Gaussian sequence 
model, focusing on quantifying the uncertainty in the change point locations and providing a scalable algorithm for 
inference. We do that by framing the problem as a product of multiple single changes in the scale parameter. We fit 
the model through an iterative procedure similar to what is don...


Submitted 25 November, 2022; originally announced November 2022. 


2. arXiv:2210.16049 [pdf, other] cs.AI stat.ML


Measuring the Confidence of Traffic Forecasting Models: Techniques, Experimental 
Comparison and Guidelines towards Their Actionability 

Authors: Ibai Laña, Ignacio Olabarrieta, Javier Del Ser

Abstract: ...evidence, this critical discussion is further informed by experimental results produced by different 
uncertainty estimation techniques over real traffic data collected in Madrid (Spain), rendering a general overview of 
the benefits and caveats of every technique, how they can be compared to each other, and how the measured 
uncertainty decreases depending on...

Submitted 28 October, 2022; originally announced October 2022. 


Comments: 46 pages, 12 figures, under review 


3. arXiv:2208.03675 [pdf, other] math.ST stat.ML 
Kernel Biclustering algorithm in Hilbert Spaces 
Authors: Marcos Matabuena, J. C Vidal, Oscar Hernan Madrid Padilla, Dino Sejdinovic 


Abstract: Biclustering algorithms partition data and covariates simultaneously, providing new insights in several 
domains, such as analyzing gene expression to discover new biological functions. This paper develops a new model- 
free biclustering algorithm in abstract spaces using the notions of energy distance (ED) and the maximum mean 
discrepancy (MMD) -- two distances between probability distributions capa...

Submitted 7 August, 2022; originally announced August 2022. 


4. arXiv:2207.12638 [pdf, other] cs.LG stat.ML 
Variance estimation in graphs with the fused lasso 


Authors: Oscar Hernan Madrid Padilla 


Abstract: We study the problem of variance estimation in general graph-structured problems. First, we develop a 
linear time estimator for the homoscedastic case that can consistently estimate the variance in general graphs. We 
show that our estimator attains minimax rates for the chain and 2D grid graphs when the mean signal has a total 
variation with canonical scaling. Furthermore, we provide general upper...

Submitted 29 August, 2022; v1 submitted 25 July, 2022; originally announced July 2022. 


5. arXiv:2206.12701 [pdf, other] stat.ML doi 10.1145/3539813.3545142


The Bandwagon Effect: Not Just Another Bias 
Authors: Norman Knyazev, Harrie Oosterhuis 


Abstract: Optimizing recommender systems based on user interaction data is mainly seen as a problem of dealing 
with selection bias, where most existing work assumes that interactions from different users are independent. 
However, it has been shown that in reality user feedback is often influenced by earlier interactions of other users, e.g. 
via average ratings, number of views or sales per item, etc. This p...

Submitted 1 July, 2022; v1 submitted 25 June, 2022; originally announced June 2022. 


Comments: In Proceedings of the 2022 ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR '22), July 11-12, 
2022, Madrid, Spain. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3539813.3545142 ACM ISBN 978-1-4503-9412-3/22/07 


6. arXiv:2206.09092 [pdf, other] math.ST


Dynamic and heterogeneous treatment effects with abrupt changes 
Authors: Oscar Hernan Madrid Padilla, Yi Yu 


Abstract: From personalised medicine to targeted advertising, it is an inherent task to provide a sequence of decisions 
with historical covariates and outcome data. This requires understanding of both the dynamics and heterogeneity of 
treatment effects. In this paper, we are concerned with detecting abrupt changes in the treatment effects in terms of 
the conditional average treatment effect (CATE) in a sequ...

Submitted 17 June, 2022; originally announced June 2022. 


7. arXiv:2205.13651 [pdf, other]


A Partially Separable Temporal Model for Dynamic Valued Networks 
Authors: Yik Lun Kei, Yanzhen Chen, Oscar Hernan Madrid Padilla 


Abstract: The Exponential-family Random Graph Model (ERGM) is a powerful statistical model to represent the 
complicated structural dependencies of a binary network observed at a single time point. However, regarding 
dynamic valued networks whose observations are matrices of counts that evolve over time, the development of the 
ERGM framework is still in its infancy. We propose a Partially Separable Temporal...

Submitted 10 November, 2022; v1 submitted 26 May, 2022; originally announced May 2022. 


8. arXiv:2205.09252 [pdf, other]


Change-point Detection for Sparse and Dense Functional Data in General Dimensions 
Authors: Carlos Misael Madrid Padilla, Daren Wang, Zifeng Zhao, Yi Yu 


Abstract: We study the problem of change-point detection and localisation for functional data sequentially observed 
on a general d-dimensional space, where we allow the functional curves to be either sparsely or densely sampled. 
Data of this form naturally arise in a wide range of applications such as biology, neuroscience, climatology, and 
finance. To achieve such a task, we propose a kernel-based algorith...

Submitted 18 May, 2022; originally announced May 2022. 


9. arXiv:2205.09094 [pdf, other]


Non-asymptotic confidence bands on the probability an individual benefits from treatment 
(PIBT) 

Authors: Gabriel Ruiz, Oscar Hernan Madrid Padilla 

Abstract: The premise of this work, in a vein similar to predictive inference with quantile regression, is that 
observations may lie far away from their conditional expectation. In the context of causal inference, due to the 
missing-ness of one outcome, it is difficult to check whether an individual's treatment effect lies close to its prediction 
given by the estimated Average Treatment Effect (ATE) or Cond...


Submitted 21 May, 2022; v1 submitted 18 May, 2022; originally announced May 2022. 
Comments: 16 pages, 4 figures 


10. arXiv:2202.01748 [pdf, other] cs.LG stat.ML


Sequentially learning the topological ordering of causal directed acyclic graphs with 
likelihood ratio scores 

Authors: Gabriel Ruiz, Oscar Hernan Madrid Padilla, Qing Zhou 

Abstract: Causal discovery, the learning of causality in a data mining scenario, has been of strong scientific and 
theoretical interest as a starting point to identify "what causes what?" Contingent on assumptions and a proper 
learning algorithm, it is sometimes possible to identify and accurately estimate a causal directed acyclic graph (DAG), 
as opposed to a Markov equivalence class of graphs that gives a...

Submitted 19 May, 2022; v1 submitted 3 February, 2022; originally announced February 2022. 


11. arXiv:2110.08665 [pdf, other] math.ST


Quantile Regression by Dyadic CART 

Authors: Oscar Hernan Madrid Padilla, Sabyasachi Chatterjee 

Abstract: In this paper we propose and study a version of the Dyadic Classification and Regression Trees (DCART) 
estimator from Donoho (1997) for (fixed design) quantile regression in general dimensions. We refer to this proposed 
estimator as the QDCART estimator. Just like the mean regression version, we show that a) a fast dynamic 
programming based algorithm with computational complexity O(N log N) exi...

Submitted 16 October, 2021; originally announced October 2021. 


12. arXiv:2110.02401 [pdf, other]


2D score based estimation of heterogeneous treatment effects 

Authors: Steven Siwei Ye, Yanzhen Chen, Oscar Hernan Madrid Padilla 

Abstract: In the study of causal inference, statisticians show growing interest in estimating and analyzing 
heterogeneity in causal effects in observational studies. However, there usually exists a trade-off between accuracy 
and interpretability for developing a desirable estimator for treatment effects, especially in the case when there are a 
large number of features in estimation. To make efforts to addre...

Submitted 5 October, 2022; v1 submitted 5 October, 2021; originally announced October 2021. 


13. arXiv:2110.00901 [pdf, other]


A causal fused lasso for interpretable heterogeneous treatment effects estimation 
Authors: Oscar Hernan Madrid Padilla, Yanzhen Chen, Gabriel Ruiz 

Abstract: We propose a novel method for estimating heterogeneous treatment effects based on the fused lasso. By 
first ordering samples based on the propensity or prognostic score, we match units from the treatment and control 
groups. We then run the fused lasso to obtain piecewise constant treatment effects with respect to the ordering 
defined by the score. Similar to the existing methods based on discret...

Submitted 14 July, 2022; v1 submitted 2 October, 2021; originally announced October 2021. 


14. arXiv:2107.10398 [pdf, other] physics.med-ph q-bio.PE stat.AP


On the Use of Time Series Kernel and Dimensionality Reduction to Identify the Acquisition 
of Antimicrobial Multidrug Resistance in the Intensive Care Unit 

Authors: Óscar Escudero-Arnanz, Joaquín Rodríguez-Álvarez, Karl Øyvind Mikalsen, Robert Jenssen, Cristina Soguero- 
Ruiz 

Abstract: ...is a major global concern. This study analyses data in the form of multivariate time series (MTS) from 3476 
patients recorded at the ICU of University Hospital of Fuenlabrada (Madrid) from 2004 to 2020. 18% of the patients
acquired AMR during their stay in the ICU. The goal of this paper is an early prediction of the development of AMR.
Towards that end, we...

Submitted 7 July, 2021; originally announced July 2021. 


15. arXiv:2106.13685 [pdf, other] stat.ML


Feature Grouping and Sparse Principal Component Analysis with Truncated Regularization 
Authors: Haiyan Jiang, Shanshan Qin, Oscar Hernan Madrid Padilla 

Abstract: In this paper, we consider a new variant for principal component analysis (PCA), aiming to capture the 
grouping and/or sparse structures of factor loadings simultaneously. To achieve these goals, we employ a non-convex 
truncated regularization with naturally adjustable sparsity and grouping effects, and propose the Feature Grouping 
and Sparse Principal Component Analysis (FGSPCA). The proposed FGS...

Submitted 13 September, 2022; v1 submitted 25 June, 2021; originally announced June 2021.
Comments: 19 pages, 3 figures 


16. arXiv:2106.10383 [pdf, other]


Scalable Bayesian change point detection with spike and slab priors 

Authors: Lorenzo Cappello, Oscar Hernan Madrid Padilla, Julia A. Palacios 

Abstract: We study the use of spike and slab priors for consistent estimation of the number of change points and 
their locations. Leveraging recent results in the variable selection literature, we show that an estimator based on spike 
and slab priors achieves optimal localization rate in the multiple offline change point detection problem. Based on this 
estimator, we propose a Bayesian change point detectio...

Submitted 18 June, 2021; originally announced June 2021. 


17. arXiv:2106.01271 [pdf, other] eess.SY stat.ML doi 10.1109/PowerTech46648.2021.9494976


Deep learning-based multi-output quantile forecasting of PV generation 

Authors: Jonathan Dumas, Colin Cointe, Xavier Fettweis, Bertrand Cornélusse 

Abstract: This paper develops probabilistic PV forecasters by taking advantage of recent breakthroughs in deep 
learning. A tailored forecasting tool, named encoder-decoder, is implemented to compute intraday multi-output PV
quantiles forecasts to efficiently capture the time correlation. The models are trained using quantile regression,
a non-parametric approach that assumes no prior knowledge of the proba...

Submitted 7 June, 2021; v1 submitted 2 June, 2021; originally announced June 2021. 

Journal ref: 2021 IEEE Madrid PowerTech 


18. arXiv:2105.13504 [pdf, other] cs.LG stat.ML


Lattice partition recovery with dyadic CART 

Authors: Oscar Hernan Madrid Padilla, Yi Yu, Alessandro Rinaldo 

Abstract: We study piece-wise constant signals corrupted by additive Gaussian noise over a d-dimensional lattice. 
Data of this form naturally arise in a host of applications, and the tasks of signal detection or testing, de-noising and 
estimation have been studied extensively in the statistical and signal processing literature. In this paper we consider 
instead the problem of partition recovery, i.e. of e...

Submitted 27 October, 2021; v1 submitted 27 May, 2021; originally announced May 2021. 


19. arXiv:2105.10890 [pdf, other] stat.AP stat.CO


Bayesian Effect Selection for Additive Quantile Regression with an Analysis to Air Pollution 
Thresholds 

Authors: Nadja Klein, Jorge Mateu 

Abstract: ...analysts' decision whether an effect should be included linearly, non-linearly or not at all in the quantiles 
of interest. In a detailed analysis on air pollution data in Madrid (Spain) we find the added value of modelling extreme 
nitrogen dioxide (NO2) concentrations and how thresholds are driven differently by several climatological variables 
and traff...

Submitted 23 May, 2021; originally announced May 2021. 


20. arXiv:2105.08186 [pdf, other] math.ST doi 10.1007/s10651-022-00539-2


Distribution-free changepoint detection tests based on the breaking of records 

Authors: Jorge Castillo-Mateo 

Abstract: ...proposed. A Monte Carlo study of size, power and changepoint estimate has been performed. Finally, the 
methods are illustrated by analyzing the time series of temperatures at Madrid, Spain. The R package RecordTest 
publicly available on CRAN implements the proposed methods.

Submitted 6 July, 2022; v1 submitted 17 May, 2021; originally announced May 2021. 

Comments: 22 pages, 7 figures, 2 tables; major revision 

Journal ref: Environmental and Ecological Statistics 29(3), 655-676 (2022) 


21. arXiv:2101.09351 [pdf] physics.data-an stat.AP doi 10.1016/j.uclim.2021.100921


Hourly evolution of intra-urban temperature variability across the local climate zones. The 
case of Madrid 


Authors: Miguel Núñez-Peiró, Carmen Sanchez-Guevara Sanchez, F. Javier Neila Gonzalez 


Abstract: ...indicators for urban temperature variability detection. In this respect, the present study introduces the 
results of an extensive monitoring campaign carried out in the city of Madrid over a two-year period (2016-2018). The 
aim of this work is to further examine the relationships between LCZs and air temperature differences, with emphasis 
on their hourly and...


Submitted 30 December, 2020; originally announced January 2021. 
Comments: 7 figures, 8 tables, 1 appendix 


22. arXiv:2012.01758 [pdf, other] math.ST


Non-parametric Quantile Regression via the K-NN Fused Lasso 

Authors: Steven Siwei Ye, Oscar Hernan Madrid Padilla 

Abstract: Quantile regression is a statistical method for estimating conditional quantiles of a response variable. In 
addition, for mean estimation, it is well known that quantile regression is more robust to outliers than l2-based
methods. By using the fused lasso penalty over a K-nearest neighbors graph, we propose an adaptive quantile
estimator in a non-parametric setup. We show that the estimator a...

Submitted 17 August, 2021; v1 submitted 3 December, 2020; originally announced December 2020. 

Journal ref: Journal of Machine Learning Research, Vol. 22, No. 111, 1-38, 2021 


23. arXiv:2010.08236 [pdf, other] stat.ML


Quantile regression with deep ReLU Networks: Estimators and minimax rates 

Authors: Oscar Hernan Madrid Padilla, Wesley Tansey, Yanzhen Chen 

Abstract: Quantile regression is the task of estimating a specified percentile response, such as the median, from a 
collection of known covariates. We study quantile regression with rectified linear unit (ReLU) neural networks as the 
chosen model class. We derive an upper bound on the expected mean squared error of a ReLU network used to 
estimate any quantile conditional on a set of covariates. This upper b...

Submitted 17 December, 2020; v1 submitted 16 October, 2020; originally announced October 2020. 


24. arXiv:2010.06538 [pdf, other] cs.LG math.OC stat.ML


Modeling Atmospheric Data and Identifying Dynamics: Temporal Data-Driven Modeling of 
Air Pollutants 

Authors: Javier Rubio-Herrero, Carlos Ortiz Marrero, Wai-Tong Louis Fan 

Abstract: ...the physical laws that govern their behaviors and relationships remain hidden. With the aid of real-world 
air quality data collected hourly in different stations throughout Madrid, we present an empirical approach using 
data-driven techniques with the following goals: (1) Find parsimonious systems of ordinary differential equations via 
sparse identification...

Submitted 6 July, 2021; v1 submitted 13 October, 2020; originally announced October 2020. 

Report number: PNNL-SA-157007 


25. arXiv:2004.13695 [pdf, other] q-bio.QM stat.AP


COVID-19: Estimating spread in Spain solving an inverse problem with a probabilistic model 
Authors: Marcos Matabuena, Carlos Meijide-Garcia, Pablo Rodriguez-Mier, Victor Leborán 

Abstract: ...percent of the population may be contaminated or have already been recovered from the virus in Madrid, 
one of the most affected regions in Spain. However, if we assume that the number of fatalities is twice as high as the 
official numbers, the number of infections could have reached 19.5%. In Galicia, one of the regions where the effect 
has been the leas...

Submitted 3 May, 2020; v1 submitted 28 April, 2020; originally announced April 2020. 

Comments: 36 pag 


26. arXiv:2004.03384 [pdf, ps, other] stat.AP


Covid-19 -- A simple statistical model for predicting ICU load in early phases of the disease 
Authors: Matthias Ritter, Derek V. M. Ott, Friedemann Paul, John-Dylan Haynes, Kerstin Ritter 


Abstract: ...allows for making predictions depending on different future growth of infections. We have evaluated our 
model for three regions, namely Berlin (Germany), Lombardy (Italy), and Madrid (Spain). Before extensive 
containment measures made an impact, we first estimate the region-specific model parameters. Whereas for Berlin, 
an ICU rate of 6%, a time lag of 6 day...

Submitted 27 July, 2020; v1 submitted 6 April, 2020; originally announced April 2020. 


27. arXiv:1912.04160 [pdf, other] 
Energy distance and kernel mean embeddings for two-sample survival testing 
Authors: Marcos Matabuena, Oscar Hernan Madrid Padilla 
Abstract: We study the comparison problem of distribution equality between two random samples under a right 
censoring scheme. To address this problem, we design a series of tests based on energy distance and kernel mean 
embeddings. We calibrate our tests using permutation methods and prove that they are consistent against all fixed 
continuous alternatives. To evaluate our proposed tests, we simulate surviva...


Submitted 9 December, 2019; originally announced December 2019. 


28. arXiv:1912.02151 [pdf, other] math.ST stat.ME 


High Dimensional Latent Panel Quantile Regression with an Application to Asset Pricing 
Authors: Alexandre Belloni, Mingli Chen, Oscar Hernan Madrid Padilla, Zixuan Wang

Abstract: We propose a generalization of the linear panel quantile regression model to accommodate both 
\textit{sparse} and \textit{dense} parts: sparse means while the number of covariates available is large, potentially 
only a much smaller number of them have a nonzero impact on each conditional quantile of the response variable;
while the dense part is represented by a low-rank matrix that can be approxima...


Submitted 23 August, 2022; v1 submitted 4 December, 2019; originally announced December 2019. 
Comments: forthcoming at the Annals of Statistics 


29. arXiv:1911.08056 [pdf, other] stat.ML


Modelling pressure-Hessian from local velocity gradients information in an incompressible 
turbulent flow field using deep neural networks 

Authors: Nishant Parashar, Sawan S. Sinha, Balaji Srinivasan 

Abstract: ...(JHU turbulence database, JHTD). The predictions made by the TBNN are tested against two different 
isotropic turbulence datasets at Reynolds number of 433 (JHTD) and 315 (UP Madrid turbulence database, UPMTD) 
and channel flow dataset at Reynolds number of 1000 (UT Texas and JHTD). The evaluation of the neural network 
output is made in terms of the alignment...

Submitted 18 November, 2019; originally announced November 2019. 


30. arXiv:1911.07494 [pdf, other] math.ST


Change point localization in dependent dynamic nonparametric random dot product 
graphs 

Authors: Oscar Hernan Madrid Padilla, Yi Yu, Carey E. Priebe 

Abstract: In this paper, we study the offline change point localization problem in a sequence of dependent 
nonparametric random dot product graphs. To be specific, assume that at every time point, a network is generated 
from a nonparametric random dot product graph model \citep[see e.g.][]{athreya2017statistical}, where the latent
positions are generated from unknown underlying distributions. The underlying...

Submitted 15 September, 2022; v1 submitted 18 November, 2019; originally announced November 2019. 


31. arXiv:1910.13289 [pdf, other] stat.ME


Optimal nonparametric multivariate change point detection and localization 
Authors: Oscar Hernan Madrid Padilla, Yi Yu, Daren Wang, Alessandro Rinaldo 


Abstract: We study the multivariate nonparametric change point detection problem, where the data are a sequence 
of independent p-dimensional random vectors whose distributions are piecewise-constant with Lipschitz densities 
changing at unknown times, called change points. We quantify the size of the distributional change at any change 
point with the supremum norm of the difference between the correspondin...

Submitted 25 June, 2020; v1 submitted 29 October, 2019; originally announced October 2019. 


32. arXiv:1907.10772 [pdf, other] cs.LG cs.AI stat.ML


Towards AutoML in the presence of Drift: first results 

Authors: Jorge G. Madrid, Hugo Jair Escalante, Eduardo F. Morales, Wei-Wei Tu, Yang Yu, Lisheng Sun-Hosoya, Isabelle 
Guyon, Michele Sebag 

Abstract: Research progress in AutoML has led to state of the art solutions that can cope quite well with supervised
learning task, e.g., classification with AutoSklearn. However, so far these systems do not take into account the
changing nature of evolving data over time (i.e., they still assume i.i.d. data); even when this sort of domains are
increasingly available in real applications (e.g., spam filtering,...


Submitted 24 July, 2019; originally announced July 2019. 
Comments: AutoML 2018 @ ICML/IJCAI-ECAI 


33. arXiv:1906.08934 [pdf, other] cs.LG cs.CL stat.ML


Meta-learning of textual representations 

Authors: Jorge Madrid, Hugo Jair Escalante, Eduardo Morales 

Abstract: Recent progress in AutoML has led to state-of-the-art methods (e.g., AutoSKLearn) that can be readily used
by non-experts to approach any supervised learning problem. Whereas these methods are quite effective, they are
still limited in the sense that they work for tabular (matrix formatted) data only. This paper describes one step forward
in trying to automate the design of supervised learning me...

Submitted 19 July, 2019; v1 submitted 20 June, 2019; originally announced June 2019. 


34. arXiv:1905.10848 [pdf, ps, other] cs.LG


Learning Gaussian DAGs from Network Data 

Authors: Hangjian Li, Oscar Hernan Madrid Padilla, Qing Zhou 

Abstract: Structural learning of directed acyclic graphs (DAGs) or Bayesian networks has been studied extensively 
under the assumption that data are independent. We propose a new Gaussian DAG model for dependent data which 
assumes the observations are correlated according to an undirected network. Under this model, we develop a 
method to estimate the DAG structure given a topological ordering of the nodes....

Submitted 28 July, 2021; v1 submitted 26 May, 2019; originally announced May 2019. 

Comments: 14 pages, 5 figures 


35. arXiv:1905.10019 [pdf, other] stat math.ST


Optimal nonparametric change point detection and localization 

Authors: Oscar Hernan Madrid Padilla, Yi Yu, Daren Wang, Alessandro Rinaldo 

Abstract: We study change point detection and localization for univariate data in fully nonparametric settings in 
which, at each time point, we acquire an i.i.d. sample from an unknown distribution. We quantify the magnitude of the 
distributional changes at the change points using the Kolmogorov--Smirnov distance. We allow all the relevant 
parameters -- the minimal spacing between two consecutive change poi...


Submitted 23 May, 2019; originally announced May 2019. 
MSC Class: Change point detection; Minimax optimality 


36. arXiv:1903.11647 [pdf, other]


Approximate Bayesian inference for multivariate point pattern analysis in disease mapping 
Authors: Francisco Palmi-Perales, Virgilio Gomez-Rubio, Gonzalo Lopez-Abente, Rebeca Ramis-Prieto, Jose Miguel 
Sanz-Anquela, Pablo Fernandez-Navarro 

Abstract: ...Partial Differential Equations (SPDE). Finally, this new framework is applied to a dataset of three different 
types of cancer and a set of controls from Alcala de Henares (Madrid, Spain). Covariates available include the distance 
to several polluting industries and socioeconomic indicators. Our findings point to a possible risk increase due to the 
proximity...

Submitted 27 March, 2019; originally announced March 2019. 


37. arXiv:1810.11042 [pdf, other] doi 10.1093/biomet/asab014


Optimal post-selection inference for sparse signals: a nonparametric empirical-Bayes 
approach 

Authors: Spencer Woody, Oscar Hernan Madrid Padilla, James G. Scott 

Abstract: Many recently developed Bayesian methods have focused on sparse signal detection. However, much less 
work has been done addressing the natural follow-up question: how to make valid inferences for the magnitude of 
those signals after selection. Ordinary Bayesian credible intervals suffer from selection bias, owing to the fact that the 
target of inference is chosen adaptively. Existing Bayesian appr...

Submitted 13 November, 2020; v1 submitted 25 October, 2018; originally announced October 2018. 


38. arXiv:1809.04933 [pdf, other] cs.LG stat.ML doi 10.3390/app8112321


Identifying Real Estate Opportunities using Machine Learning 
Authors: Alejandro Baldominos, Iván Blanco, Antonio José Moreno, Rubén Iturrarte, Óscar Bernárdez, Carlos Afonso 


Abstract: ...This program can be useful for investors interested in the housing market. We have focused in a use case 
considering real estate assets located in the Salamanca district in Madrid (Spain) and listed in the most relevant 
Spanish online site for home sales and rentals. The application is formally implemented as a regression problem that 
tries to estimate the...

Submitted 21 November, 2018; v1 submitted 13 September, 2018; originally announced September 2018. 

Comments: 24 pages, 13 figures, 5 tables 


Journal ref: Baldominos, A.; Blanco, I.; Moreno, A.J.; Iturrarte, R.; Bernárdez, Ó.; Afonso, C. Identifying Real Estate Opportunities Using
Machine Learning. Appl. Sci. 2018, 8, 2321 


39. arXiv:1807.11641 [pdf, other]


Adaptive Non-Parametric Regression With the K-NN Fused Lasso 

Authors: Oscar Hernan Madrid Padilla, James Sharpnack, Yanzhen Chen, Daniela M. Witten 

Abstract: The fused lasso, also known as total-variation denoising, is a locally-adaptive function estimator over a 
regular grid of design points. In this paper, we extend the fused lasso to settings in which the points do not occur on a
regular grid, leading to an approach for non-parametric regression. This approach, which we call the K-nearest
neighbors (K-NN) fused lasso, involves (i) computing the...

Submitted 8 July, 2019; v1 submitted 30 July, 2018; originally announced July 2018. 


40. arXiv:1805.12338 [pdf, other] cs.RO eess.SP stat.ML doi 10.1109/IROS.2018.8594399


Hallucinating robots: Inferring Obstacle Distances from Partial Laser Measurements 
Authors: Jens Lundell, Francesco Verdoja, Ville Kyrki 

Abstract: Many mobile robots rely on 2D laser scanners for localization, mapping, and navigation. However, those 
sensors are unable to correctly provide distance to obstacles such as glass panels and tables whose actual occupancy 
is invisible at the height the sensor is measuring. In this work, instead of estimating the distance to obstacles from 
richer sensor readings such as 3D lasers or RGBD sensors, we...

Submitted 29 July, 2018; v1 submitted 31 May, 2018; originally announced May 2018. 

Comments: In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 

Journal ref: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 2018, pp. 4781-4787 


41. arXiv:1805.07042 [pdf, other]


Graphon estimation via nearest neighbor algorithm and 2D fused lasso denoising 

Authors: Oscar Hernan Madrid Padilla 

Abstract: We propose a class of methods for graphon estimation based on exploiting connections with nonparametric 
regression. The idea is to construct an ordering of the nodes in the network, similar in spirit to Chan and Airoldi 
(2014). However, rather than only considering orderings based on the empirical degree as in Chan and Airoldi (2014), 
we use the nearest neighbor algorithm which is an approximating...

Submitted 18 June, 2019; v1 submitted 18 May, 2018; originally announced May 2018. 


42. arXiv:1803.00967 [pdf, other] cs.AI cs.LG stat.AP stat.ML


Active model learning and diverse action sampling for task and motion planning 

Authors: Zi Wang, Caelan Reed Garrett, Leslie Pack Kaelbling, Tomas Lozano-Pérez 

Abstract: The objective of this work is to augment the basic abilities of a robot by learning to use new sensorimotor 
primitives to enable the solution of complex long-horizon problems. Solving long-horizon problems in complex 
domains requires flexible generative planning that can combine primitive abilities in novel combinations to solve 
problems as they arise in the world. In order to plan to combine prim...

Submitted 12 August, 2018; v1 submitted 2 March, 2018; originally announced March 2018. 

Comments: Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.
https://www.youtube.com/playlist?list=PLoWhBFPMfSzDbc8CYelsbHZa1d3uz-W_c


43. arXiv:1612.07867 [pdf, other]


Sequential nonparametric tests for a change in distribution: an application to detecting 
radiological anomalies 


Authors: Oscar Hernan Madrid Padilla, Alex Athey, Alex Reinhart, James G. Scott 


Abstract: We propose a sequential nonparametric test for detecting a change in distribution, based on windowed 
Kolmogorov--Smirnov statistics. The approach is simple, robust, highly computationally efficient, easy to calibrate, and 
requires no parametric assumptions about the underlying null and alternative distributions. We show that both the 
false-alarm rate and the power of our procedure are amenable to...

Submitted 22 December, 2016; originally announced December 2016. 


44. arXiv:1511.06750 [pdf, other]


A deconvolution path for mixtures 

Authors: Oscar Hernan Madrid Padilla, Nicholas G. Polson, James G. Scott 

Abstract: We propose a class of estimators for deconvolution in mixture models based on a simple two-step "bin-and- 
smooth" procedure applied to histogram counts. The method is both statistically and computationally efficient: by 
exploiting recent advances in convex optimization, we are able to provide a full deconvolution path that shows the 
estimate for the mixing distribution across a range of plausible d...


Nonparametric density estimation by histogram trend filtering

Authors: Oscar Hernan Madrid Padilla, James G. Scott 

Abstract: We propose a novel approach for density estimation called histogram trend filtering. Our estimator arises 
from looking at surrogate Poisson model for counts of observations in a partition of the support of the data. We begin 
by showing consistency for a variational estimator for this density estimation problem. We then study a discrete 
estimator that can be efficiently found via convex optimizatio...


Vector-Space Markov Random Fields via Exponential Families

Authors: Wesley Tansey, Oscar Hernan Madrid Padilla, Arun Sai Suggala, Pradeep Ravikumar 

Abstract: We present Vector-Space Markov Random Fields (VS-MRFs), a novel class of undirected graphical models 
where each variable can belong to an arbitrary vector space. VS-MRFs generalize a recent line of work on scalar- 
valued, uni-parameter exponential family and mixed graphical models, thereby greatly broadening the class of 
exponential families available (e.g., allowing multinomial and Dirichlet distr...


Tensor decomposition with generalized lasso penalties

Authors: Oscar Hernan Madrid Padilla, James G. Scott 

Abstract: We present an approach for penalized tensor decomposition (PTD) that estimates smoothly varying latent 
factors in multi-way data. This generalizes existing work on sparse tensor decomposition and penalized matrix 
decompositions, in a manner parallel to the generalized lasso for regression and smoothing problems. Our approach 
presents many nontrivial challenges at the intersection of modeling and c...


Priors for Random Count Matrices Derived from a Family of Negative Binomial Processes
Authors: Mingyuan Zhou, Oscar Hernan Madrid Padilla, James G. Scott 


Abstract: We define a family of probability distributions for random count matrices with a potentially unbounded 
number of rows and columns. The three distributions we consider are derived from the gamma-Poisson, gamma- 
negative binomial, and beta-negative binomial processes. Because the models lead to closed-form Gibbs sampling 
update equations, they are natural candidates for nonparametric Bayesian priors...